NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


Active Advantage-Aligned Online Reinforcement Learning with Offline Data

Liu, X; Le, HT; Chen, S; Stevens, R; Yang, Z; Walter, MR; Chen, Y (July 2025, Exploration in AI Today Workshop at ICML (ExAI))

Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but often produces suboptimal results due to limited data coverage. Recent efforts integrate offline and online RL in order to harness the advantages of both approaches. However, effectively combining online and offline RL remains challenging due to issues that include catastrophic forgetting, lack of robustness to data quality, and limited sample efficiency in data utilization. In an effort to address these challenges, we introduce A3RL, which incorporates a novel confidence-aware Active Advantage Aligned (A3) sampling strategy that dynamically prioritizes data aligned with the policy's evolving needs from both online and offline sources, optimizing policy improvement. Moreover, we provide theoretical insights into the effectiveness of our active sampling strategy and conduct diverse empirical experiments and ablation studies, demonstrating that our method outperforms competing online RL techniques that leverage offline data. Our code will be publicly available at: this https URL.
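As a rough illustration of what prioritizing transitions from a combined online and offline pool could look like, the sketch below draws a batch with probability proportional to a confidence term times an exponentiated advantage. The priority formula, the `temperature` parameter, and the function names are assumptions made for this example; the abstract does not state the actual A3 priority, so this should not be read as the authors' method.

```python
import math
import random

def priority_weights(advantages, confidences, temperature=1.0):
    """Hypothetical priority: confidence * exp(advantage / temperature), normalized.

    This formula is an illustrative assumption, not the rule used by A3RL.
    """
    top = max(advantages)  # subtract the max before exponentiating for stability
    raw = [c * math.exp((a - top) / temperature)
           for a, c in zip(advantages, confidences)]
    total = sum(raw)
    return [r / total for r in raw]

def sample_batch(transitions, advantages, confidences, batch_size, rng):
    """Draw a batch (with replacement) from the pooled data using the priorities."""
    probs = priority_weights(advantages, confidences)
    return rng.choices(transitions, weights=probs, k=batch_size)

# Toy pool mixing offline and online transitions (placeholder tuples).
pool = [("offline", i) for i in range(3)] + [("online", i) for i in range(3)]
adv = [-1.0, 0.0, 0.5, 0.2, 1.0, -0.5]   # made-up advantage estimates
conf = [1.0] * len(pool)                 # made-up confidence scores
probs = priority_weights(adv, conf)
batch = sample_batch(pool, adv, conf, batch_size=4, rng=random.Random(0))
```

In this toy setup the transition with the largest advantage estimate receives the largest sampling probability, regardless of whether it came from the offline or online source.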
Free, publicly-accessible full text available July 13, 2026
Integrating resource-efficiency into photonics education: new course development and redefining design training for a holistic approach

https://doi.org/10.1117/12.3076841

Serna, S; Liu, X; Nagarkar, P; Ono, T; Huttner, E; Unger, B; Saini, S; Kimerling, L; Agarwal, A (July 2025, SPIE)
Kallepalli, Akhil (Ed.)
As the semiconductor and photonics industries grapple with mounting business pressures, weaving resource-efficiency into engineering education has evolved from a priority to an imperative. Under the umbrella of FUTUR-IC, this paper highlights novel pedagogical strategies at Bridgewater State University (BSU) aimed at equipping photonics and optical engineers to address today’s ecological challenges. We detail two complementary approaches that together form a cohesive educational framework. The first involves a newly introduced first-year-level seminar on Resource Efficient Microchip Manufacturing, which immerses students in resource-efficiency metrics such as Life Cycle Intelligence and “design for resource-efficiency” principles. By interlinking photonic integration concepts with tangible business impact assessments, this course fosters an early appreciation of how advanced technologies can be developed responsibly, with reduced energy consumption and minimized waste. The second approach redefines senior-level engineering design courses to embed multifaceted resource-efficiency criteria in the design process. Through project-based learning and collaboration with industry partners, students integrate photonic solutions with data-driven metrics, refining their ability to propose holistic prototypes. These initiatives go beyond technical mastery to cultivate interdisciplinary collaboration and critical thinking. This work illustrates how an integrated approach to engineering education can spark the next generation of practitioners to design for both technological excellence and business viability.
Free, publicly-accessible full text available July 7, 2026
Exploring students’ interest-driven patterns of scientific observation in Minecraft

Wei, Z; Nasiar, N; Zambrano, AF; Liu, X; Ocumpaugh, J; Barany, A; Baker, RS; Giordano, C (June 2025, Proceedings)

Students bring different levels of interest to learning experiences, which impacts how they engage with learning materials. This study aims to understand the relationship between students' interest levels and their scientific observation behaviors within a Minecraft-based learning system. Motivated by the growing interest in integrating human-AI collaboration within educational research, we combine the capabilities of Large Language Models (LLMs) with the expertise of human researchers to capture the emerging themes within students’ observations. Using epistemic network analysis, we then visualized and compared the observational patterns of students with high and low situational interest. Our findings indicate that students with higher situational interest tend to make observations across a broader range of topics, with a particular emphasis on scientific content. These results highlight the potential for developing timely interventions to support students with low situational interest.
Free, publicly-accessible full text available June 13, 2026
Training Large Recommendation Models via Graph-Language Tokens Alignment

Yang, M; Liu, Z; Yang, L; Liu, X; Wang, C; Peng, H; Yu, PS (April 2025, WWW (Companion))

Free, publicly-accessible full text available April 28, 2026
Contextual Active Model Selection

Liu, X; Xia, F; Stevens, R; Chen, Y (December 2024, The Thirty-eighth Annual Conference on Neural Information Processing Systems)

While training models and labeling data are resource-intensive, a wealth of pre-trained models and unlabeled data exists. To effectively utilize these resources, we present an approach to actively select pre-trained models while minimizing labeling costs. We frame this as an online contextual active model selection problem: At each round, the learner receives an unlabeled data point as a context. The objective is to adaptively select the best model to make a prediction while limiting label requests. To tackle this problem, we propose CAMS, a contextual active model selection algorithm that relies on two novel components: (1) a contextual model selection mechanism, which leverages context information to make informed decisions about which model is likely to perform best for a given context, and (2) an active query component, which strategically chooses when to request labels for data points, minimizing the overall labeling cost. We provide rigorous theoretical analysis for the regret and query complexity under both adversarial and stochastic settings. Furthermore, we demonstrate the effectiveness of our algorithm on a diverse collection of benchmark classification tasks. Notably, CAMS requires substantially less labeling effort (less than 10%) compared to existing methods on CIFAR10 and DRIFT benchmarks, while achieving similar or better accuracy.
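The round-by-round protocol described above (receive a context, pick a model, decide whether to request a label) can be sketched as follows. This is a deliberately simplified stand-in, not the CAMS algorithm: it assumes a hypothetical discrete context id, queries only when the candidate models disagree, and applies a fixed penalty `eta` to models that err on queried rounds. All class and parameter names are invented for the example.

```python
from collections import defaultdict

class ActiveModelSelector:
    """Toy selector: per-context scores, label queries only on disagreement."""

    def __init__(self, n_models, eta=0.5):
        self.n_models = n_models
        self.eta = eta  # assumed penalty per observed mistake
        # One score vector per (hypothetical) discrete context id.
        self.scores = defaultdict(lambda: [0.0] * n_models)
        self.queries = 0

    def select(self, context):
        """Return the index of the currently best-scoring model for this context."""
        s = self.scores[context]
        return max(range(self.n_models), key=lambda i: s[i])

    def should_query(self, predictions):
        """Request a label only if the models disagree (a simplifying assumption)."""
        return len(set(predictions)) > 1

    def update(self, context, predictions, label):
        """Penalize every model whose prediction differs from the queried label."""
        self.queries += 1
        s = self.scores[context]
        for i, pred in enumerate(predictions):
            if pred != label:
                s[i] -= self.eta

# Two toy "pre-trained models": one always predicts 0, one predicts the parity.
models = [lambda x: 0, lambda x: x % 2]
selector = ActiveModelSelector(n_models=2)
for x in range(10):
    context, label = "c", x % 2
    preds = [m(x) for m in models]
    chosen = selector.select(context)        # model used for this round
    if selector.should_query(preds):
        selector.update(context, preds, label)
```

In this toy stream the models disagree only on odd inputs, so labels are requested on half of the rounds, and the parity model ends up selected for the context.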
Full Text Available
Attack-Resilient Image Watermarking Using Stable Diffusion

Zhang, L; Liu, X; i_Martin, V; Bearfield, X; Brun, Y; Guan, H (December 2024, 38th Conference on Neural Information Processing Systems (NeurIPS 2024))

Full Text Available
APOLLO: SGD-like Memory, AdamW-level Performance

Zhu, H; Zhang, Z; Cong, W; Liu, X; Park, S; Chandra, V; Long, B; Pan, D Z; Wang, Z; Lee, J (February 2025, https://doi.org/10.48550/arXiv.2412.05270)

Large language models (LLMs) are notoriously memory-intensive during training, particularly with the popular AdamW optimizer. This memory burden necessitates using more or higher-end GPUs or reducing batch sizes, limiting training scalability and throughput. To address this, various memory-efficient optimizers have been proposed to reduce optimizer memory usage. However, they face critical challenges: (i) reliance on costly SVD operations; (ii) significant performance trade-offs compared to AdamW; and (iii) still substantial optimizer memory overhead to maintain competitive performance. In this work, we identify that AdamW's learning rate adaptation rule can be effectively coarsened as a structured learning rate update. Based on this insight, we propose Approximated Gradient Scaling for Memory-Efficient LLM Optimization (APOLLO), which approximates learning rate scaling using an auxiliary low-rank optimizer state based on pure random projection. This structured learning rate update rule makes APOLLO highly tolerant to further memory reductions while delivering comparable pre-training performance. Even its rank-1 variant, APOLLO-Mini, achieves superior pre-training performance compared to AdamW with SGD-level memory costs. Extensive experiments demonstrate that the APOLLO series performs on par with or better than AdamW, while achieving greater memory savings by nearly eliminating the optimization states of AdamW. These savings provide significant system-level benefits: (1) Enhanced Throughput: 3x throughput on an 8xA100-80GB setup compared to AdamW by supporting 4x larger batch sizes. (2) Improved Model Scalability: Pre-training LLaMA-13B with naive DDP on A100-80GB GPUs without system-level optimizations. (3) Low-End GPU Friendly Pre-training: Pre-training LLaMA-7B on a single GPU using less than 12 GB of memory with weight quantization.
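To make the idea of a structured learning-rate update driven by a low-rank auxiliary state more concrete, here is a small pure-Python sketch. It projects the gradient with a fixed random matrix, keeps Adam-style moments only in the projected space, and turns them into one scaling factor per gradient column. The per-column norm ratio, the class name `LowRankScaler`, and all hyperparameter defaults are assumptions for illustration; the abstract does not give these details, so this is not presented as the APOLLO update rule.

```python
import math
import random

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def column_norms(mat):
    """Euclidean norm of each column."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*mat)]

class LowRankScaler:
    """Keeps Adam-style moments only for a rank x n_cols projected gradient."""

    def __init__(self, n_rows, n_cols, rank, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
        rng = random.Random(seed)
        std = 1.0 / math.sqrt(rank)
        # Fixed Gaussian random projection (no SVD involved).
        self.proj = [[rng.gauss(0.0, std) for _ in range(n_rows)] for _ in range(rank)]
        self.m = [[0.0] * n_cols for _ in range(rank)]  # first moment, low-rank
        self.v = [[0.0] * n_cols for _ in range(rank)]  # second moment, low-rank
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.step = 0

    def scaled_gradient(self, grad):
        """Return grad with each column rescaled by a factor estimated in low rank."""
        self.step += 1
        low = matmul(self.proj, grad)  # shape: rank x n_cols
        b1, b2 = self.beta1, self.beta2
        adapted = []
        for i, row in enumerate(low):
            out_row = []
            for j, g in enumerate(row):
                self.m[i][j] = b1 * self.m[i][j] + (1 - b1) * g
                self.v[i][j] = b2 * self.v[i][j] + (1 - b2) * g * g
                m_hat = self.m[i][j] / (1 - b1 ** self.step)
                v_hat = self.v[i][j] / (1 - b2 ** self.step)
                out_row.append(m_hat / (math.sqrt(v_hat) + self.eps))
            adapted.append(out_row)
        # Assumed scaling rule: ratio of adapted to raw column norms.
        scales = [n / (d + self.eps)
                  for n, d in zip(column_norms(adapted), column_norms(low))]
        return [[g * s for g, s in zip(row, scales)] for row in grad]

# Toy 4 x 3 gradient with a rank-1 auxiliary state.
scaler = LowRankScaler(n_rows=4, n_cols=3, rank=1)
grad = [[0.1 * (i + 1) * (j + 1) for j in range(3)] for i in range(4)]
update = scaler.scaled_gradient(grad)
```

The point of the sketch is the memory footprint: the two moment buffers hold `rank * n_cols` values each instead of `n_rows * n_cols`, while the returned update keeps the full gradient shape.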
Free, publicly-accessible full text available February 17, 2026
FinLoRA: Finetuning Quantized Financial Large Language Models Using Low-Rank Adaptation on GPUs

Wang, D; Kim, D; Jin, B; Zhao, X; Fu, T; Yang, S Y; Liu, X (December 2024, arXiv preprints)

Finetuned large language models (LLMs) have shown remarkable performance in financial tasks, such as sentiment analysis and information retrieval. Due to privacy concerns, finetuning and deploying financial LLMs (FinLLMs) locally are crucial for institutions and individuals. In this paper, we employ quantized low-rank adaptation (QLoRA) to finetune FinLLMs, which leverages low-rank structure and quantization techniques to significantly reduce computational requirements while maintaining model performance. We also employ data and pipeline parallelism to enable local finetuning on commodity GPUs. Experiments on financial datasets validate the efficacy of our approach in yielding notable improvements over the base models.
Full Text Available
Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

Zhou, Y; Zhu, J; Xu, P; Liu, X; Wang, X; Koutra, D; Ai, W; Huang, F (November 2024, Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’24))

Full Text Available
Collaborative Alignment for Recommendation

Wang, C; Yang, L; Liu, Z; Liu, X; Yang, M; Liang, Y; Yu, PS (October 2024, ACM CIKM)

Full Text Available
